GPT-5 Rubric Comparison Report

Comparing Original vs Reformulated (Dependency-Checked) Rubrics

Total Tasks: 677
Unchanged Rubrics: 9
Reformulated Rubrics: 668
Reformulation Rate: 98.7%
Segments: 12
compositional_tasks_v2 (87 tasks, 87 changed)
flights (51 tasks, 51 changed)
hotels_head (52 tasks, 52 changed)
jobs (38 tasks, 38 changed)
price_comparison (57 tasks, 57 changed)
realestate_complex (48 tasks, 48 changed)
recipe_to_shopping (48 tasks, 48 changed)
restaurants_tail (52 tasks, 51 changed)
shopping_head (56 tasks, 56 changed)
shopping_lists_tail (51 tasks, 51 changed)
things_to_do (80 tasks, 72 changed)
ticketing (57 tasks, 57 changed)
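
The headline figures can be cross-checked against the per-segment counts listed above. The following is a minimal sketch, not the report's own tooling: the `SEGMENTS` mapping and the `summarize` helper are hypothetical names introduced here, and the reformulation rate is assumed to be changed tasks divided by total tasks, rounded to one decimal place.

```python
# Per-segment (total tasks, changed tasks), copied from the segment list above.
# SEGMENTS and summarize are illustrative names, not part of the original report.
SEGMENTS = {
    "compositional_tasks_v2": (87, 87),
    "flights": (51, 51),
    "hotels_head": (52, 52),
    "jobs": (38, 38),
    "price_comparison": (57, 57),
    "realestate_complex": (48, 48),
    "recipe_to_shopping": (48, 48),
    "restaurants_tail": (52, 51),
    "shopping_head": (56, 56),
    "shopping_lists_tail": (51, 51),
    "things_to_do": (80, 72),
    "ticketing": (57, 57),
}


def summarize(segments):
    """Aggregate per-segment counts into the report's headline numbers."""
    total = sum(tasks for tasks, _ in segments.values())
    changed = sum(changed for _, changed in segments.values())
    # Segments where at least one rubric was left unchanged.
    partially_changed = {
        name: tasks - changed
        for name, (tasks, changed) in segments.items()
        if changed < tasks
    }
    return {
        "segments": len(segments),
        "total": total,
        "changed": changed,
        "unchanged": total - changed,
        # Assumed definition: changed / total, as a percentage with one decimal.
        "rate_pct": round(100 * changed / total, 1),
        "partially_changed": partially_changed,
    }


summary = summarize(SEGMENTS)
print(summary)
```

Under these assumptions the per-segment counts reproduce the summary figures, and the nine unchanged rubrics all fall in two segments: restaurants_tail (1) and things_to_do (8).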